refactor(stream): use columnar eval for temporal join non-lookup conds #15228

Merged: 19 commits, Feb 27, 2024

Conversation

TennyZhuang
Contributor

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Will do a benchmark later

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features; see "Sqlsmith: Sql feature generation", #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@TennyZhuang
Contributor Author

Note: Checking "Hide whitespace" can significantly improve your review experience.

@TennyZhuang TennyZhuang requested a review from xxchan February 23, 2024 09:43
Signed-off-by: TennyZhuang <[email protected]>
@TennyZhuang TennyZhuang changed the title from "refactor(stream): use columnar eval for temporal join non-eq conds" to "refactor(stream): use columnar eval for temporal join non-lookup conds" Feb 23, 2024
Contributor

@wangrunji0408 wangrunji0408 left a comment

Generally LGTM

Comment on lines 406 to 411
#[try_stream]
async {
    #[allow(unreachable_code)]
    if false {
        return Err(unreachable!("type hints only") as StreamExecutorError);
    }
Contributor

This looks really weird. 😇
Is it possible to use #[try_stream(error = StreamExecutorError)]?

Contributor Author

They don't support that.

Contributor Author

It's non-trivial to declare a variable with an impl Trait type. TAIT (type alias impl Trait) must be enabled for that, and there are many corner cases, so I guess the macro author doesn't want to do that.

Contributor Author

I'll create a helper macro to generate the code.
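
For readers unfamiliar with the hack discussed in this thread, here is a minimal sketch of the same idea using only the standard library. It uses a plain async block rather than `#[try_stream]`, and `DemoError`, `NoopWake`, `block_on`, and `parse_sum` are hypothetical names introduced for illustration, not RisingWave code. The dead `if false` branch never runs; it exists only so the compiler can infer the block's error type, which `?` needs as its conversion target.

```rust
use std::future::Future;
use std::num::ParseIntError;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical error type standing in for `StreamExecutorError`.
#[derive(Debug)]
struct DemoError(String);

impl From<ParseIntError> for DemoError {
    fn from(e: ParseIntError) -> Self {
        DemoError(e.to_string())
    }
}

/// Waker that does nothing; enough to drive a future that never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny demo-only executor: polls the future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// The lint allowance is placed on the function here; the PR puts it on the
// `if` statement itself.
#[allow(unreachable_code)]
fn parse_sum(a: &str, b: &str) -> Result<i32, String> {
    let fut = async move {
        // Never executed: the dead `return` only tells the compiler that the
        // block's error type is `DemoError`.
        if false {
            return Err(unreachable!("type hints only") as DemoError);
        }
        let x: i32 = a.parse()?; // ParseIntError is converted into DemoError
        let y: i32 = b.parse()?;
        Ok(x + y)
    };
    // Nothing here pins down the error type, so the hint above is what
    // makes the closure argument's type known.
    block_on(fut).map_err(|e| e.0)
}

fn main() {
    println!("{:?}", parse_sum("40", "2"));
    println!("{:?}", parse_sum("40", "x"));
}
```

Removing the `if false` branch from this sketch should make the compiler ask for type annotations, which is the situation the hack works around.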

let ArrayImpl::Bool(bool_array) = &*filter else {
    panic!("unmatched type: filter expr returns a non-null array");
};
let new_vis = bool_array.to_bitmap() | (!row_matched);
Contributor

Why are non-matched rows visible in the result? 🤔

Contributor Author

LEFT OUTER JOIN

Contributor Author

@TennyZhuang TennyZhuang Feb 23, 2024

Non-matched rows are only appended to the chunk when it's a LEFT OUTER join, so they should be visible here.
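
The visibility rule from this thread can be sketched with plain vectors instead of RisingWave's `Bitmap` and `BoolArray`. This is an illustration under assumptions, not the executor's code: `new_visibility` is a hypothetical helper, and a NULL filter result is assumed to count as false, as in ordinary SQL filtering. A row stays visible if the non-lookup condition passed, or if the row had no match in the first place (such rows only exist in the chunk for a LEFT OUTER join).

```rust
/// Hypothetical stand-in for `bool_array.to_bitmap() | (!row_matched)`.
/// `filter[i]` is the non-lookup condition result for row i (None = NULL),
/// `row_matched[i]` says whether row i found a lookup match.
fn new_visibility(filter: &[Option<bool>], row_matched: &[bool]) -> Vec<bool> {
    assert_eq!(filter.len(), row_matched.len(), "one filter result per row");
    filter
        .iter()
        .zip(row_matched)
        // Assumption: NULL is treated as false when building the bitmap.
        .map(|(&f, &matched)| f.unwrap_or(false) | !matched)
        .collect()
}

fn main() {
    let filter = [Some(true), Some(false), None, Some(false)];
    let row_matched = [true, true, true, false];
    // The last row is an unmatched row and stays visible regardless of the
    // filter result.
    println!("{:?}", new_visibility(&filter, &row_matched));
}
```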

Signed-off-by: TennyZhuang <[email protected]>
@TennyZhuang
Contributor Author

Need to handle i2o_mapping after the second phase.

@yuhao-su
Contributor

@chenzl25 Is it possible that the LHS will be non-append-only in the future? If so, columnar eval could become more complicated.

Signed-off-by: TennyZhuang <[email protected]>
@yuhao-su
Contributor

If the LHS matches a row in the RHS in st1, but the row is filtered out in st2, we should expect a matched empty row in the result.

@TennyZhuang
Contributor Author

> If the LHS matches a row in the RHS in st1, but the row is filtered out in st2, we should expect a matched empty row in the result.

Recorded in #15257

The PR doesn't make the behavior worse, so I guess we can merge it first.
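
One reading of the st1/st2 case raised above, sketched row by row: under standard LEFT OUTER JOIN semantics, an LHS row whose lookup matches are all rejected by the non-lookup condition should still appear once, padded with NULL on the RHS side. The function and row shapes below are hypothetical and only illustrate that expected result; they are not how the executor is implemented, and the exact behavior tracked in #15257 may differ.

```rust
/// Hypothetical row shapes: LHS and RHS rows are `(key, value)` pairs.
/// Output rows are `(lhs_key, lhs_value, Option<rhs_value>)`.
fn left_outer_two_stage(
    lhs: &[(i32, i32)],
    rhs: &[(i32, i32)],
    cond: impl Fn(i32, i32) -> bool,
) -> Vec<(i32, i32, Option<i32>)> {
    let mut out = Vec::new();
    for &(l_key, l_val) in lhs {
        let survivors: Vec<i32> = rhs
            .iter()
            // st1: lookup by key.
            .filter(|&&(r_key, _)| r_key == l_key)
            .map(|&(_, r_val)| r_val)
            // st2: non-lookup condition.
            .filter(|&r_val| cond(l_val, r_val))
            .collect();
        if survivors.is_empty() {
            // No match at all, or every match was filtered out in st2:
            // emit the LHS row once with a NULL RHS.
            out.push((l_key, l_val, None));
        } else {
            out.extend(survivors.into_iter().map(|r| (l_key, l_val, Some(r))));
        }
    }
    out
}

fn main() {
    let lhs = [(1, 10), (2, 20), (3, 30)];
    let rhs = [(1, 5), (2, 50)];
    // Key 2 matches in st1 but fails `l_val > r_val` in st2.
    println!("{:?}", left_outer_two_stage(&lhs, &rhs, |l, r| l > r));
}
```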

Comment on lines +417 to +423
#[try_stream]
async {
    #[allow(unreachable_code)]
    #[allow(clippy::diverging_sub_expression)]
    if false {
        return unreachable!("type hints only") as StreamExecutorResult<_>;
    }
Member

Is it possible to extract a method for this async block? The hack here looks ugly🤣

Contributor Author

I'd prefer to keep the hack code in a small scope and use it only where needed.

Member

I think this is very acceptable. I like the simple and stupid way

@chenzl25
Contributor

> @chenzl25 Is it possible that the LHS will be non-append-only in the future? If so, columnar eval could become more complicated.

Yes, we do want to support non-append only temporal join in the future. #15218

@TennyZhuang
Contributor Author

TennyZhuang commented Feb 27, 2024

[image: Nexmark result with a simple non-lookup condition]

@TennyZhuang TennyZhuang added this pull request to the merge queue Feb 27, 2024
github-merge-queue bot pushed a commit that referenced this pull request Feb 27, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 27, 2024
@TennyZhuang TennyZhuang added this pull request to the merge queue Feb 27, 2024
Merged via the queue into main with commit 791c0c8 Feb 27, 2024
26 of 27 checks passed
@TennyZhuang TennyZhuang deleted the temporal-join-chunk-eval branch February 27, 2024 09:45